10. TD Control: Sarsamax
Check out this (optional) research paper to read the proof that Sarsamax (also known as Q-learning) converges.
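For reference, here is a minimal Python sketch of a single Sarsamax update step, since that is the rule the convergence proof concerns. The function name `sarsamax_update`, the dictionary-of-lists layout for `Q`, and the parameters `alpha`, `gamma`, and `done` are illustrative assumptions, not code from this lesson.

```python
from collections import defaultdict


def sarsamax_update(Q, state, action, reward, next_state, n_actions,
                    alpha=0.1, gamma=1.0, done=False):
    """Apply one Sarsamax (Q-learning) update to Q[state][action].

    Q is assumed to map each state to a list of action values.
    """
    # Value of the greedy action in the next state (0 if the episode ended).
    best_next = 0.0 if done else max(Q[next_state][a] for a in range(n_actions))
    # TD target: immediate reward plus discounted greedy value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    Q[state][action] += alpha * (td_target - Q[state][action])
    return Q[state][action]


# Hypothetical two-action example with made-up values.
Q = defaultdict(lambda: [0.0, 0.0])
Q["s1"] = [1.0, 3.0]
new_value = sarsamax_update(Q, "s0", 0, reward=-1.0, next_state="s1",
                            n_actions=2, alpha=0.5, gamma=1.0)
```

The target uses the maximum action value in the next state regardless of which action the agent actually takes next, which is what distinguishes Sarsamax from Sarsa.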